Overview

Brought to you by YData

Dataset statistics

Number of variables: 19
Number of observations: 4500
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.7 MiB
Average record size in memory: 1.1 KiB
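
As a quick sanity check, the average record size should be roughly the total size in memory divided by the number of observations. A minimal sketch, assuming the rounded figures listed above (the variable names are illustrative and not part of the report):

```python
# Figures copied from the Dataset statistics (already rounded by the report).
total_size_mib = 4.7
n_observations = 4500

# 1 MiB = 1024 KiB; divide by the row count to get KiB per record.
avg_record_kib = total_size_mib * 1024 / n_observations
print(f"{avg_record_kib:.1f} KiB per record")
```

Because the inputs are rounded, the result can only be expected to agree with the reported value to about one decimal place.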

Variable types

Text: 6
Numeric: 3
DateTime: 1
Categorical: 9

Alerts

cluster is highly overall correlated with patientincome (High correlation)
patientincome is highly overall correlated with cluster (High correlation)
claimlegitimacy is highly imbalanced (67.3%) (Imbalance)
claimid has unique values (Unique)
patientid has unique values (Unique)
providerid has unique values (Unique)
patientincome has unique values (Unique)
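
The Imbalance alert reports a score rather than a class share. As far as we understand, ydata-profiling computes it as one minus the Shannon entropy of the class counts, normalized by the logarithm of the number of classes, so 0% means perfectly balanced and 100% means a single class. The sketch below re-implements that formula with the standard library only. The class counts are hypothetical (the claimlegitimacy counts are not shown in this section) and were chosen only to illustrate what kind of split gives a score near 67.3%.

```python
import math

def imbalance_score(counts):
    """Return 1 - normalized Shannon entropy (0 = balanced, 1 = one class only)."""
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# Hypothetical two-class split of 4500 rows (NOT taken from the report).
hypothetical_counts = [4230, 270]
score = imbalance_score(hypothetical_counts)
print(f"{score:.1%}")
```

Under this assumed formula, a 94% / 6% split between two classes is the kind of distribution that would trigger an alert of this size.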

Reproduction

Analysis started: 2025-07-10 12:50:30.366471
Analysis finished: 2025-07-10 12:50:54.796896
Duration: 24.43 seconds
Software version: ydata-profiling v4.16.1
Download configuration: config.json
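
The reported duration can be recomputed from the two timestamps. A small sketch using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2025-07-10 12:50:30.366471")
finished = datetime.fromisoformat("2025-07-10 12:50:54.796896")

# Difference between the two timestamps, in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```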

Variables

claimid
Text

Unique 

Distinct: 4500
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 408.8 KiB

Length

Max length: 36
Median length: 36
Mean length: 36
Min length: 36

Characters and Unicode

Total characters: 162000
Distinct characters: 17
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
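
To illustrate what these character properties look like, the sketch below inspects the first claimid sample with Python's standard `unicodedata` module and cross-checks two of the character totals. It is an independent illustration, not the code the report itself runs; the report lists the categories as (unknown), whereas the standard library returns the general category of each code point.

```python
import unicodedata
from collections import Counter

sample = "4d76c7f7-d36a-4139-b451-a9a4ad10d7d5"  # 1st sample row of claimid
n_rows = 4500                                    # number of observations

# General category per character:
# Nd = decimal digit, Ll = lowercase letter, Pd = dash punctuation.
categories = Counter(unicodedata.category(ch) for ch in sample)
print(dict(categories))

# Every value has length 36 with 4 hyphens, so the totals follow directly.
total_chars = n_rows * len(sample)
total_hyphens = n_rows * sample.count("-")
print(total_chars, total_hyphens)
```

The two totals correspond to the "Total characters" figure and the count of "-" in the character frequency table for this variable.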

Unique

Unique: 4500
Unique (%): 100.0%

Sample

1st row4d76c7f7-d36a-4139-b451-a9a4ad10d7d5
2nd rowe35193b4-3609-492b-866a-98de19317e9c
3rd row1f3fa373-25ed-4ff4-b6c7-38dcb2fb297f
4th rowaf6a68f4-8319-47b1-a28b-77de01572851
5th row417fe944-79d2-4610-81c4-a2d496f29ee4
Value                                 Count  Frequency (%)
417fe944-79d2-4610-81c4-a2d496f29ee4  1      < 0.1%
291cfa64-9956-40e7-b89f-4628650f42f0  1      < 0.1%
4d76c7f7-d36a-4139-b451-a9a4ad10d7d5  1      < 0.1%
e35193b4-3609-492b-866a-98de19317e9c  1      < 0.1%
1492c9c7-e184-413d-b951-f4377400782f  1      < 0.1%
a1684758-40b1-4f1d-8f5b-409c7228dbac  1      < 0.1%
2c2ed3f4-90c6-4681-94b4-d20278d85963  1      < 0.1%
379d7c46-3096-4741-9d42-26f540347070  1      < 0.1%
ab6b425f-957e-4448-a715-97d8aabddb6d  1      < 0.1%
e1464b6a-4ea4-4fa1-952d-e16ebdd032c5  1      < 0.1%
Other values (4490)                   4490   99.8%

Most occurring characters

ValueCountFrequency (%)
- 18000
 
11.1%
4 12975
 
8.0%
8 9645
 
6.0%
a 9608
 
5.9%
b 9564
 
5.9%
9 9537
 
5.9%
6 8529
 
5.3%
f 8511
 
5.3%
e 8473
 
5.2%
2 8464
 
5.2%
Other values (7) 58694
36.2%

Most occurring categories

ValueCountFrequency (%)
(unknown) 162000
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
- 18000
 
11.1%
4 12975
 
8.0%
8 9645
 
6.0%
a 9608
 
5.9%
b 9564
 
5.9%
9 9537
 
5.9%
6 8529
 
5.3%
f 8511
 
5.3%
e 8473
 
5.2%
2 8464
 
5.2%
Other values (7) 58694
36.2%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 162000
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
- 18000
 
11.1%
4 12975
 
8.0%
8 9645
 
6.0%
a 9608
 
5.9%
b 9564
 
5.9%
9 9537
 
5.9%
6 8529
 
5.3%
f 8511
 
5.3%
e 8473
 
5.2%
2 8464
 
5.2%
Other values (7) 58694
36.2%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 162000
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
- 18000
 
11.1%
4 12975
 
8.0%
8 9645
 
6.0%
a 9608
 
5.9%
b 9564
 
5.9%
9 9537
 
5.9%
6 8529
 
5.3%
f 8511
 
5.3%
e 8473
 
5.2%
2 8464
 
5.2%
Other values (7) 58694
36.2%

patientid
Text

Unique 

Distinct: 4500
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 408.8 KiB

Length

Max length36
Median length36
Mean length36
Min length36

Characters and Unicode

Total characters162000
Distinct characters17
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4500 ?
Unique (%)100.0%

Sample

1st row19cf2638-3ec0-4ed9-9995-d9ba4553813a
2nd row5c4bb6c5-4dd3-4a86-85fa-f36c0d8debff
3rd row777866e0-4d10-45a8-a7b4-dbdaa26d5a81
4th row9d7c53ee-eb1a-4f07-9e3a-e86cf82e9f0f
5th rowdb14b0ca-ac2a-4e83-b085-947ea32e7587
ValueCountFrequency (%)
db14b0ca-ac2a-4e83-b085-947ea32e7587 1
 
< 0.1%
2bd2d173-4ce1-428d-836c-259d9236a839 1
 
< 0.1%
19cf2638-3ec0-4ed9-9995-d9ba4553813a 1
 
< 0.1%
5c4bb6c5-4dd3-4a86-85fa-f36c0d8debff 1
 
< 0.1%
fb07a807-4dcc-4e09-bea6-4ca54acf6add 1
 
< 0.1%
638c3542-dc16-4507-95f8-a1bb0c425624 1
 
< 0.1%
bce42931-4ff7-487b-b373-773bdb57241b 1
 
< 0.1%
09a26428-831a-4d5f-bd9f-ee790468aae5 1
 
< 0.1%
67f76baf-3c23-45ee-8898-ec4a25c85e11 1
 
< 0.1%
72f46521-1f31-4707-bbd9-4760af6d9d5c 1
 
< 0.1%
Other values (4490) 4490
99.8%

Most occurring characters

ValueCountFrequency (%)
- 18000
 
11.1%
4 12931
 
8.0%
8 9664
 
6.0%
9 9593
 
5.9%
a 9583
 
5.9%
b 9565
 
5.9%
d 8596
 
5.3%
e 8508
 
5.3%
5 8484
 
5.2%
6 8474
 
5.2%
Other values (7) 58602
36.2%

Most occurring categories

ValueCountFrequency (%)
(unknown) 162000
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
- 18000
 
11.1%
4 12931
 
8.0%
8 9664
 
6.0%
9 9593
 
5.9%
a 9583
 
5.9%
b 9565
 
5.9%
d 8596
 
5.3%
e 8508
 
5.3%
5 8484
 
5.2%
6 8474
 
5.2%
Other values (7) 58602
36.2%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 162000
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
- 18000
 
11.1%
4 12931
 
8.0%
8 9664
 
6.0%
9 9593
 
5.9%
a 9583
 
5.9%
b 9565
 
5.9%
d 8596
 
5.3%
e 8508
 
5.3%
5 8484
 
5.2%
6 8474
 
5.2%
Other values (7) 58602
36.2%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 162000
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
- 18000
 
11.1%
4 12931
 
8.0%
8 9664
 
6.0%
9 9593
 
5.9%
a 9583
 
5.9%
b 9565
 
5.9%
d 8596
 
5.3%
e 8508
 
5.3%
5 8484
 
5.2%
6 8474
 
5.2%
Other values (7) 58602
36.2%

providerid
Text

Unique 

Distinct: 4500
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 408.8 KiB

Length

Max length36
Median length36
Mean length36
Min length36

Characters and Unicode

Total characters162000
Distinct characters17
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4500 ?
Unique (%)100.0%

Sample

1st rowa3d0cc80-dffe-40ff-a302-23c8ffeedb36
2nd rowa9f25acf-92b8-45e2-9cef-87bd07d0a591
3rd row951b1e08-9948-4956-80e5-9277f16bd290
4th rowde9e193a-f9a1-4d63-9345-aefe75694628
5th row5c7d7045-71b6-4c15-937c-43e4cfe65bf4
ValueCountFrequency (%)
5c7d7045-71b6-4c15-937c-43e4cfe65bf4 1
 
< 0.1%
cf84cf99-0ac3-465a-af90-239a873bafa5 1
 
< 0.1%
a3d0cc80-dffe-40ff-a302-23c8ffeedb36 1
 
< 0.1%
a9f25acf-92b8-45e2-9cef-87bd07d0a591 1
 
< 0.1%
4cbf206c-b046-40d6-b953-927d2ed77950 1
 
< 0.1%
20685b18-4e11-4a78-b714-1d5e70610385 1
 
< 0.1%
9e339ef9-c22f-4a42-b299-857fbbc1fa81 1
 
< 0.1%
4d2bab66-8b53-40b4-9724-c8c77522e8c3 1
 
< 0.1%
866b8799-1436-4c1b-ae34-37e1fda57c99 1
 
< 0.1%
e84f0876-375d-40f4-a112-46c24ac59627 1
 
< 0.1%
Other values (4490) 4490
99.8%

Most occurring characters

ValueCountFrequency (%)
- 18000
 
11.1%
4 12912
 
8.0%
8 9676
 
6.0%
9 9606
 
5.9%
a 9544
 
5.9%
b 9384
 
5.8%
f 8602
 
5.3%
e 8574
 
5.3%
1 8547
 
5.3%
5 8485
 
5.2%
Other values (7) 58670
36.2%

Most occurring categories

ValueCountFrequency (%)
(unknown) 162000
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
- 18000
 
11.1%
4 12912
 
8.0%
8 9676
 
6.0%
9 9606
 
5.9%
a 9544
 
5.9%
b 9384
 
5.8%
f 8602
 
5.3%
e 8574
 
5.3%
1 8547
 
5.3%
5 8485
 
5.2%
Other values (7) 58670
36.2%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 162000
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
- 18000
 
11.1%
4 12912
 
8.0%
8 9676
 
6.0%
9 9606
 
5.9%
a 9544
 
5.9%
b 9384
 
5.8%
f 8602
 
5.3%
e 8574
 
5.3%
1 8547
 
5.3%
5 8485
 
5.2%
Other values (7) 58670
36.2%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 162000
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
- 18000
 
11.1%
4 12912
 
8.0%
8 9676
 
6.0%
9 9606
 
5.9%
a 9544
 
5.9%
b 9384
 
5.8%
f 8602
 
5.3%
e 8574
 
5.3%
1 8547
 
5.3%
5 8485
 
5.2%
Other values (7) 58670
36.2%

claimamount
Real number (ℝ)

Distinct: 4490
Distinct (%): 99.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5014.2039
Minimum: 100.12
Maximum: 9997.2
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 35.3 KiB

Quantile statistics

Minimum: 100.12
5-th percentile: 590.725
Q1: 2509.0725
median: 5053.765
Q3: 7462.4525
95-th percentile: 9510.307
Maximum: 9997.2
Range: 9897.08
Interquartile range (IQR): 4953.38

Descriptive statistics

Standard deviation: 2866.2911
Coefficient of variation (CV): 0.57163433
Kurtosis: -1.2029995
Mean: 5014.2039
Median Absolute Deviation (MAD): 2483.075
Skewness: 0.00042447153
Sum: 22563917
Variance: 8215624.5
Monotonicity: Not monotonic
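
Several of the claimamount statistics are derived from one another, so they can be cross-checked. A minimal sketch, assuming the rounded figures reported above (small discrepancies are expected because the inputs are rounded; variable names are illustrative):

```python
# Figures copied from the claimamount statistics.
n = 4500
mean, std = 5014.2039, 2866.2911
q1, q3 = 2509.0725, 7462.4525
minimum, maximum = 100.12, 9997.2

cv = std / mean                    # coefficient of variation
iqr = q3 - q1                      # interquartile range
value_range = maximum - minimum    # range
variance = std ** 2                # variance is the squared standard deviation
total = mean * n                   # sum is the mean times the row count

print(f"CV={cv:.8f} IQR={iqr:.2f} range={value_range:.2f}")
print(f"variance={variance:.1f} sum={total:.0f}")
```

Each recomputed value should agree with the reported Coefficient of variation, Interquartile range, Range, Variance, and Sum up to rounding.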
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
5540.34 2
 
< 0.1%
862.1 2
 
< 0.1%
6116.75 2
 
< 0.1%
8936.33 2
 
< 0.1%
7946.69 2
 
< 0.1%
9963.71 2
 
< 0.1%
4153.18 2
 
< 0.1%
8414.04 2
 
< 0.1%
6118.26 2
 
< 0.1%
6834.26 2
 
< 0.1%
Other values (4480) 4480
99.6%
ValueCountFrequency (%)
100.12 1
< 0.1%
100.3 1
< 0.1%
101.33 1
< 0.1%
106.47 1
< 0.1%
111.01 1
< 0.1%
113.4 1
< 0.1%
114.59 1
< 0.1%
115.49 1
< 0.1%
119.72 1
< 0.1%
131.86 1
< 0.1%
ValueCountFrequency (%)
9997.2 1
< 0.1%
9995.62 1
< 0.1%
9994.2 1
< 0.1%
9989.04 1
< 0.1%
9983.64 1
< 0.1%
9979.55 1
< 0.1%
9978.43 1
< 0.1%
9977.72 1
< 0.1%
9976.52 1
< 0.1%
9972.66 1
< 0.1%
Distinct: 731
Distinct (%): 16.2%
Missing: 0
Missing (%): 0.0%
Memory size: 35.3 KiB
Minimum: 2022-07-09 00:00:00
Maximum: 2024-07-08 00:00:00
Invalid dates: 0
Invalid dates (%): 0.0%
Histogram with fixed size bins (bins=50)
Distinct: 4495
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 272.6 KiB

Length

Max length5
Median length5
Mean length5
Min length5

Characters and Unicode

Total characters22500
Distinct characters62
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4490 ?
Unique (%)99.8%

Sample

1st rowTa150
2nd rowFo766
3rd rowAX876
4th rowSQ441
5th rowFK970
ValueCountFrequency (%)
ir323 2
 
< 0.1%
oc742 2
 
< 0.1%
tk486 2
 
< 0.1%
pa477 2
 
< 0.1%
ae034 2
 
< 0.1%
me712 2
 
< 0.1%
rs522 2
 
< 0.1%
yl726 2
 
< 0.1%
ej032 2
 
< 0.1%
xa248 2
 
< 0.1%
Other values (4476) 4480
99.6%

Most occurring characters

ValueCountFrequency (%)
4 1388
 
6.2%
2 1387
 
6.2%
7 1383
 
6.1%
5 1382
 
6.1%
9 1364
 
6.1%
6 1354
 
6.0%
3 1339
 
6.0%
1 1316
 
5.8%
8 1300
 
5.8%
0 1287
 
5.7%
Other values (52) 9000
40.0%

Most occurring categories

ValueCountFrequency (%)
(unknown) 22500
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
4 1388
 
6.2%
2 1387
 
6.2%
7 1383
 
6.1%
5 1382
 
6.1%
9 1364
 
6.1%
6 1354
 
6.0%
3 1339
 
6.0%
1 1316
 
5.8%
8 1300
 
5.8%
0 1287
 
5.7%
Other values (52) 9000
40.0%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 22500
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
4 1388
 
6.2%
2 1387
 
6.2%
7 1383
 
6.1%
5 1382
 
6.1%
9 1364
 
6.1%
6 1354
 
6.0%
3 1339
 
6.0%
1 1316
 
5.8%
8 1300
 
5.8%
0 1287
 
5.7%
Other values (52) 9000
40.0%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 22500
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
4 1388
 
6.2%
2 1387
 
6.2%
7 1383
 
6.1%
5 1382
 
6.1%
9 1364
 
6.1%
6 1354
 
6.0%
3 1339
 
6.0%
1 1316
 
5.8%
8 1300
 
5.8%
0 1287
 
5.7%
Other values (52) 9000
40.0%
Distinct: 4495
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 272.6 KiB

Length

Max length5
Median length5
Mean length5
Min length5

Characters and Unicode

Total characters22500
Distinct characters62
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4491 ?
Unique (%)99.8%

Sample

1st rowiO013
2nd rowjR349
3rd rowuU479
4th rowXs264
5th rowPV476
ValueCountFrequency (%)
zw098 3
 
0.1%
uu155 2
 
< 0.1%
zf251 2
 
< 0.1%
ty099 2
 
< 0.1%
tz954 2
 
< 0.1%
yg753 2
 
< 0.1%
jw378 2
 
< 0.1%
nr774 2
 
< 0.1%
ln407 2
 
< 0.1%
mt610 2
 
< 0.1%
Other values (4475) 4479
99.5%

Most occurring characters

ValueCountFrequency (%)
5 1419
 
6.3%
9 1395
 
6.2%
1 1385
 
6.2%
3 1369
 
6.1%
7 1368
 
6.1%
0 1350
 
6.0%
6 1344
 
6.0%
4 1309
 
5.8%
8 1296
 
5.8%
2 1265
 
5.6%
Other values (52) 9000
40.0%

Most occurring categories

ValueCountFrequency (%)
(unknown) 22500
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
5 1419
 
6.3%
9 1395
 
6.2%
1 1385
 
6.2%
3 1369
 
6.1%
7 1368
 
6.1%
0 1350
 
6.0%
6 1344
 
6.0%
4 1309
 
5.8%
8 1296
 
5.8%
2 1265
 
5.6%
Other values (52) 9000
40.0%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 22500
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
5 1419
 
6.3%
9 1395
 
6.2%
1 1385
 
6.2%
3 1369
 
6.1%
7 1368
 
6.1%
0 1350
 
6.0%
6 1344
 
6.0%
4 1309
 
5.8%
8 1296
 
5.8%
2 1265
 
5.6%
Other values (52) 9000
40.0%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 22500
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
5 1419
 
6.3%
9 1395
 
6.2%
1 1385
 
6.2%
3 1369
 
6.1%
7 1368
 
6.1%
0 1350
 
6.0%
6 1344
 
6.0%
4 1309
 
5.8%
8 1296
 
5.8%
2 1265
 
5.6%
Other values (52) 9000
40.0%

patientage
Real number (ℝ)

Distinct: 100
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49.838444
Minimum: 0
Maximum: 99
Zeros: 45
Zeros (%): 1.0%
Negative: 0
Negative (%): 0.0%
Memory size: 35.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5
Q1: 25
median: 50.5
Q3: 75
95-th percentile: 95
Maximum: 99
Range: 99
Interquartile range (IQR): 50

Descriptive statistics

Standard deviation: 28.790471
Coefficient of variation (CV): 0.57767595
Kurtosis: -1.2092364
Mean: 49.838444
Median Absolute Deviation (MAD): 25.5
Skewness: -0.02178574
Sum: 224273
Variance: 828.89121
Monotonicity: Not monotonic
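
For context, these patientage figures are close to what a discrete uniform distribution over the integers 0 to 99 would produce. The sketch below computes the theoretical mean, standard deviation, and excess kurtosis of such a distribution for comparison. This is an interpretive aid under the assumption that a uniform reference is meaningful here, not a claim about how the data were generated.

```python
import math

n_values = 100  # integer ages 0..99, matching the reported minimum and maximum

# Closed-form moments of a discrete uniform distribution on 0..n_values-1.
uniform_mean = (n_values - 1) / 2
uniform_std = math.sqrt((n_values ** 2 - 1) / 12)
uniform_excess_kurtosis = -6 * (n_values ** 2 + 1) / (5 * (n_values ** 2 - 1))

print(f"mean={uniform_mean:.2f} std={uniform_std:.3f} "
      f"excess kurtosis={uniform_excess_kurtosis:.4f}")
```

The reported mean (49.838444), standard deviation (28.790471), and kurtosis (-1.2092364) can be compared directly against these reference values.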
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
57 64
 
1.4%
25 59
 
1.3%
70 58
 
1.3%
1 57
 
1.3%
16 56
 
1.2%
81 56
 
1.2%
48 55
 
1.2%
79 55
 
1.2%
76 54
 
1.2%
97 54
 
1.2%
Other values (90) 3932
87.4%
ValueCountFrequency (%)
0 45
1.0%
1 57
1.3%
2 44
1.0%
3 39
0.9%
4 33
0.7%
5 38
0.8%
6 34
0.8%
7 33
0.7%
8 49
1.1%
9 51
1.1%
ValueCountFrequency (%)
99 46
1.0%
98 44
1.0%
97 54
1.2%
96 45
1.0%
95 40
0.9%
94 35
0.8%
93 40
0.9%
92 49
1.1%
91 45
1.0%
90 35
0.8%

patientgender
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 255.0 KiB
F
2282 
M
2218 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters4500
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowF
2nd rowM
3rd rowM
4th rowF
5th rowF

Common Values

Value  Count  Frequency (%)
F      2282   50.7%
M      2218   49.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
f 2282
50.7%
m 2218
49.3%

Most occurring characters

ValueCountFrequency (%)
F 2282
50.7%
M 2218
49.3%

Most occurring categories

ValueCountFrequency (%)
(unknown) 4500
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
F 2282
50.7%
M 2218
49.3%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 4500
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
F 2282
50.7%
M 2218
49.3%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 4500
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
F 2282
50.7%
M 2218
49.3%
Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 299.7 KiB
Pediatrics
955 
Cardiology
907 
Orthopedics
893 
General Practice
880 
Neurology
865 

Length

Max length16
Median length11
Mean length11.179556
Min length9

Characters and Unicode

Total characters50308
Distinct characters22
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowOrthopedics
2nd rowCardiology
3rd rowCardiology
4th rowCardiology
5th rowNeurology

Common Values

Value             Count  Frequency (%)
Pediatrics        955    21.2%
Cardiology        907    20.2%
Orthopedics       893    19.8%
General Practice  880    19.6%
Neurology         865    19.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
pediatrics 955
17.8%
cardiology 907
16.9%
orthopedics 893
16.6%
general 880
16.4%
practice 880
16.4%
neurology 865
16.1%
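
The percentages in this word table differ from those in the Common Values table because the denominator changes. The table appears to count lowercase, whitespace-separated tokens, so "General Practice" contributes two tokens per row. A sketch under that assumption, using the counts listed for this variable:

```python
from collections import Counter

# Value counts copied from the Common Values table for this variable.
value_counts = {
    "Pediatrics": 955,
    "Cardiology": 907,
    "Orthopedics": 893,
    "General Practice": 880,
    "Neurology": 865,
}

# Assumed tokenization: lowercase, split on whitespace.
word_counts = Counter()
for value, count in value_counts.items():
    for token in value.lower().split():
        word_counts[token] += count

total_tokens = sum(word_counts.values())
shares = {w: round(100 * c / total_tokens, 1) for w, c in word_counts.items()}
print(total_tokens, shares)
```

With 5380 tokens in total rather than 4500 rows, "pediatrics" accounts for 955 / 5380, which rounds to 17.8% instead of 21.2%.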

Most occurring characters

ValueCountFrequency (%)
r 5380
10.7%
e 5353
10.6%
i 4590
 
9.1%
o 4437
 
8.8%
a 3622
 
7.2%
c 3608
 
7.2%
d 2755
 
5.5%
t 2728
 
5.4%
l 2652
 
5.3%
s 1848
 
3.7%
Other values (12) 13335
26.5%

Most occurring categories

ValueCountFrequency (%)
(unknown) 50308
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
r 5380
10.7%
e 5353
10.6%
i 4590
 
9.1%
o 4437
 
8.8%
a 3622
 
7.2%
c 3608
 
7.2%
d 2755
 
5.5%
t 2728
 
5.4%
l 2652
 
5.3%
s 1848
 
3.7%
Other values (12) 13335
26.5%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 50308
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
r 5380
10.7%
e 5353
10.6%
i 4590
 
9.1%
o 4437
 
8.8%
a 3622
 
7.2%
c 3608
 
7.2%
d 2755
 
5.5%
t 2728
 
5.4%
l 2652
 
5.3%
s 1848
 
3.7%
Other values (12) 13335
26.5%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 50308
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
r 5380
10.7%
e 5353
10.6%
i 4590
 
9.1%
o 4437
 
8.8%
a 3622
 
7.2%
c 3608
 
7.2%
d 2755
 
5.5%
t 2728
 
5.4%
l 2652
 
5.3%
s 1848
 
3.7%
Other values (12) 13335
26.5%

claimstatus
Categorical

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 281.4 KiB
Approved
1522 
Denied
1512 
Pending
1466 

Length

Max length8
Median length7
Mean length7.0022222
Min length6

Characters and Unicode

Total characters31510
Distinct characters12
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
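
The length and character statistics for claimstatus follow directly from the three category counts. A small sketch that recomputes them from the counts shown for this variable (names are illustrative):

```python
# Category counts copied from the claimstatus summary.
value_counts = {"Approved": 1522, "Denied": 1512, "Pending": 1466}

n_rows = sum(value_counts.values())
total_chars = sum(len(value) * count for value, count in value_counts.items())
mean_length = total_chars / n_rows
distinct_chars = len(set("".join(value_counts)))  # case-sensitive

print(n_rows, total_chars, f"{mean_length:.7f}", distinct_chars)
```

The results should match the reported Total characters, Mean length, and Distinct characters for this variable.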

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowPending
2nd rowDenied
3rd rowPending
4th rowPending
5th rowApproved

Common Values

Value     Count  Frequency (%)
Approved  1522   33.8%
Denied    1512   33.6%
Pending   1466   32.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
approved 1522
33.8%
denied 1512
33.6%
pending 1466
32.6%

Most occurring characters

ValueCountFrequency (%)
e 6012
19.1%
d 4500
14.3%
n 4444
14.1%
p 3044
9.7%
i 2978
9.5%
o 1522
 
4.8%
A 1522
 
4.8%
r 1522
 
4.8%
v 1522
 
4.8%
D 1512
 
4.8%
Other values (2) 2932
9.3%

Most occurring categories

ValueCountFrequency (%)
(unknown) 31510
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
e 6012
19.1%
d 4500
14.3%
n 4444
14.1%
p 3044
9.7%
i 2978
9.5%
o 1522
 
4.8%
A 1522
 
4.8%
r 1522
 
4.8%
v 1522
 
4.8%
D 1512
 
4.8%
Other values (2) 2932
9.3%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 31510
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
e 6012
19.1%
d 4500
14.3%
n 4444
14.1%
p 3044
9.7%
i 2978
9.5%
o 1522
 
4.8%
A 1522
 
4.8%
r 1522
 
4.8%
v 1522
 
4.8%
D 1512
 
4.8%
Other values (2) 2932
9.3%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 31510
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
e 6012
19.1%
d 4500
14.3%
n 4444
14.1%
p 3044
9.7%
i 2978
9.5%
o 1522
 
4.8%
A 1522
 
4.8%
r 1522
 
4.8%
v 1522
 
4.8%
D 1512
 
4.8%
Other values (2) 2932
9.3%

patientincome
Real number (ℝ)

High correlation  Unique 

Distinct: 4500
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 84384.284
Minimum: 20006.87
Maximum: 149957.52
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 35.3 KiB

Quantile statistics

Minimum: 20006.87
5-th percentile: 26070.636
Q1: 52791.905
median: 84061.205
Q3: 115768.42
95-th percentile: 142561.27
Maximum: 149957.52
Range: 129950.65
Interquartile range (IQR): 62976.513

Descriptive statistics

Standard deviation: 37085.909
Coefficient of variation (CV): 0.43948834
Kurtosis: -1.1707225
Mean: 84384.284
Median Absolute Deviation (MAD): 31383.245
Skewness: 0.015295257
Sum: 3.7972928 × 10^8
Variance: 1.3753646 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
131676.02 1
 
< 0.1%
57595.11 1
 
< 0.1%
140772.72 1
 
< 0.1%
69803.19 1
 
< 0.1%
138895.98 1
 
< 0.1%
96529.57 1
 
< 0.1%
28830.41 1
 
< 0.1%
111654.49 1
 
< 0.1%
20440.42 1
 
< 0.1%
131764.74 1
 
< 0.1%
Other values (4490) 4490
99.8%
ValueCountFrequency (%)
20006.87 1
< 0.1%
20031.31 1
< 0.1%
20031.58 1
< 0.1%
20053.34 1
< 0.1%
20093.19 1
< 0.1%
20102.64 1
< 0.1%
20117.76 1
< 0.1%
20122.6 1
< 0.1%
20166.98 1
< 0.1%
20278.32 1
< 0.1%
ValueCountFrequency (%)
149957.52 1
< 0.1%
149935.67 1
< 0.1%
149913.57 1
< 0.1%
149857.61 1
< 0.1%
149837.5 1
< 0.1%
149820.25 1
< 0.1%
149819.49 1
< 0.1%
149812.83 1
< 0.1%
149794.76 1
< 0.1%
149728.97 1
< 0.1%
Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 281.4 KiB
Married
1181 
Widowed
1127 
Divorced
1101 
Single
1091 

Length

Max length8
Median length7
Mean length7.0022222
Min length6

Characters and Unicode

Total characters31510
Distinct characters16
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowSingle
2nd rowWidowed
3rd rowMarried
4th rowMarried
5th rowDivorced

Common Values

Value     Count  Frequency (%)
Married   1181   26.2%
Widowed   1127   25.0%
Divorced  1101   24.5%
Single    1091   24.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
married 1181
26.2%
widowed 1127
25.0%
divorced 1101
24.5%
single 1091
24.2%

Most occurring characters

ValueCountFrequency (%)
d 4536
14.4%
i 4500
14.3%
e 4500
14.3%
r 3463
11.0%
o 2228
 
7.1%
a 1181
 
3.7%
M 1181
 
3.7%
W 1127
 
3.6%
w 1127
 
3.6%
D 1101
 
3.5%
Other values (6) 6566
20.8%

Most occurring categories

ValueCountFrequency (%)
(unknown) 31510
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
d 4536
14.4%
i 4500
14.3%
e 4500
14.3%
r 3463
11.0%
o 2228
 
7.1%
a 1181
 
3.7%
M 1181
 
3.7%
W 1127
 
3.6%
w 1127
 
3.6%
D 1101
 
3.5%
Other values (6) 6566
20.8%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 31510
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
d 4536
14.4%
i 4500
14.3%
e 4500
14.3%
r 3463
11.0%
o 2228
 
7.1%
a 1181
 
3.7%
M 1181
 
3.7%
W 1127
 
3.6%
w 1127
 
3.6%
D 1101
 
3.5%
Other values (6) 6566
20.8%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 31510
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
d 4536
14.4%
i 4500
14.3%
e 4500
14.3%
r 3463
11.0%
o 2228
 
7.1%
a 1181
 
3.7%
M 1181
 
3.7%
W 1127
 
3.6%
w 1127
 
3.6%
D 1101
 
3.5%
Other values (6) 6566
20.8%
Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 285.9 KiB
Employed
1188 
Unemployed
1141 
Student
1110 
Retired
1061 

Length

Max length10
Median length8
Mean length8.0246667
Min length7

Characters and Unicode

Total characters36111
Distinct characters16
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowEmployed
2nd rowEmployed
3rd rowStudent
4th rowEmployed
5th rowUnemployed

Common Values

Value       Count  Frequency (%)
Employed    1188   26.4%
Unemployed  1141   25.4%
Student     1110   24.7%
Retired     1061   23.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
employed 1188
26.4%
unemployed 1141
25.4%
student 1110
24.7%
retired 1061
23.6%

Most occurring characters

ValueCountFrequency (%)
e 6702
18.6%
d 4500
12.5%
t 3281
9.1%
m 2329
 
6.4%
y 2329
 
6.4%
l 2329
 
6.4%
o 2329
 
6.4%
p 2329
 
6.4%
n 2251
 
6.2%
E 1188
 
3.3%
Other values (6) 6544
18.1%

Most occurring categories

ValueCountFrequency (%)
(unknown) 36111
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
e 6702
18.6%
d 4500
12.5%
t 3281
9.1%
m 2329
 
6.4%
y 2329
 
6.4%
l 2329
 
6.4%
o 2329
 
6.4%
p 2329
 
6.4%
n 2251
 
6.2%
E 1188
 
3.3%
Other values (6) 6544
18.1%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 36111
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
e 6702
18.6%
d 4500
12.5%
t 3281
9.1%
m 2329
 
6.4%
y 2329
 
6.4%
l 2329
 
6.4%
o 2329
 
6.4%
p 2329
 
6.4%
n 2251
 
6.2%
E 1188
 
3.3%
Other values (6) 6544
18.1%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 36111
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
e 6702
18.6%
d 4500
12.5%
t 3281
9.1%
m 2329
 
6.4%
y 2329
 
6.4%
l 2329
 
6.4%
o 2329
 
6.4%
p 2329
 
6.4%
n 2251
 
6.2%
E 1188
 
3.3%
Other values (6) 6544
18.1%
Distinct: 3876
Distinct (%): 86.1%
Missing: 0
Missing (%): 0.0%
Memory size: 303.6 KiB

Length

Max length22
Median length19
Mean length12.056222
Min length6

Characters and Unicode

Total characters54253
Distinct characters50
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3417 ?
Unique (%)75.9%

Sample

1st rowNew Alishaview
2nd rowEast Curtis
3rd rowLake Jennifer
4th rowMartinstad
5th rowThomasfurt
ValueCountFrequency (%)
east 336
 
5.0%
north 333
 
4.9%
south 330
 
4.9%
lake 324
 
4.8%
west 317
 
4.7%
port 304
 
4.5%
new 298
 
4.4%
michael 32
 
0.5%
jennifer 20
 
0.3%
james 20
 
0.3%
Other values (3077) 4428
65.7%

Most occurring characters

ValueCountFrequency (%)
e 5232
 
9.6%
t 4293
 
7.9%
a 4276
 
7.9%
r 4221
 
7.8%
o 3682
 
6.8%
h 3002
 
5.5%
n 2964
 
5.5%
i 2682
 
4.9%
s 2615
 
4.8%
2242
 
4.1%
Other values (40) 19044
35.1%

Most occurring categories

ValueCountFrequency (%)
(unknown) 54253
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
e 5232
 
9.6%
t 4293
 
7.9%
a 4276
 
7.9%
r 4221
 
7.8%
o 3682
 
6.8%
h 3002
 
5.5%
n 2964
 
5.5%
i 2682
 
4.9%
s 2615
 
4.8%
2242
 
4.1%
Other values (40) 19044
35.1%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 54253
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
e 5232
 
9.6%
t 4293
 
7.9%
a 4276
 
7.9%
r 4221
 
7.8%
o 3682
 
6.8%
h 3002
 
5.5%
n 2964
 
5.5%
i 2682
 
4.9%
s 2615
 
4.8%
2242
 
4.1%
Other values (40) 19044
35.1%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 54253
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
e 5232
 
9.6%
t 4293
 
7.9%
a 4276
 
7.9%
r 4221
 
7.8%
o 3682
 
6.8%
h 3002
 
5.5%
n 2964
 
5.5%
i 2682
 
4.9%
s 2615
 
4.8%
2242
 
4.1%
Other values (40) 19044
35.1%

claimtype
Categorical

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 289.0 KiB
Outpatient
1152 
Routine
1149 
Inpatient
1128 
Emergency
1071 

Length

Max length10
Median length9
Mean length8.7453333
Min length7

Characters and Unicode

Total characters39354
Distinct characters17
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowInpatient
2nd rowInpatient
3rd rowEmergency
4th rowRoutine
5th rowInpatient

Common Values

Value       Count  Frequency (%)
Outpatient  1152   25.6%
Routine     1149   25.5%
Inpatient   1128   25.1%
Emergency   1071   23.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
outpatient 1152
25.6%
routine 1149
25.5%
inpatient 1128
25.1%
emergency 1071
23.8%

Most occurring characters

ValueCountFrequency (%)
t 6861
17.4%
n 5628
14.3%
e 5571
14.2%
i 3429
8.7%
u 2301
 
5.8%
a 2280
 
5.8%
p 2280
 
5.8%
O 1152
 
2.9%
R 1149
 
2.9%
o 1149
 
2.9%
Other values (7) 7554
19.2%

Most occurring categories

ValueCountFrequency (%)
(unknown) 39354
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
t 6861
17.4%
n 5628
14.3%
e 5571
14.2%
i 3429
8.7%
u 2301
 
5.8%
a 2280
 
5.8%
p 2280
 
5.8%
O 1152
 
2.9%
R 1149
 
2.9%
o 1149
 
2.9%
Other values (7) 7554
19.2%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 39354
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
t 6861
17.4%
n 5628
14.3%
e 5571
14.2%
i 3429
8.7%
u 2301
 
5.8%
a 2280
 
5.8%
p 2280
 
5.8%
O 1152
 
2.9%
R 1149
 
2.9%
o 1149
 
2.9%
Other values (7) 7554
19.2%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 39354
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
t 6861
17.4%
n 5628
14.3%
e 5571
14.2%
i 3429
8.7%
u 2301
 
5.8%
a 2280
 
5.8%
p 2280
 
5.8%
O 1152
 
2.9%
R 1149
 
2.9%
o 1149
 
2.9%
Other values (7) 7554
19.2%
Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 274.0 KiB
Paper
1544 
Phone
1495 
Online
1461 

Length

Max length: 6
Median length: 5
Mean length: 5.3246667
Min length: 5

Characters and Unicode

Total characters: 23961
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Paper
2nd row: Online
3rd row: Online
4th row: Phone
5th row: Phone

Common Values

Value | Count | Frequency (%)
Paper | 1544 | 34.3%
Phone | 1495 | 33.2%
Online | 1461 | 32.5%
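
The Length and Characters figures reported for this variable can be re-derived from these three counts alone. The sketch below is a plain arithmetic check using the Python standard library; the names are illustrative, and it is not taken from the profiling tool itself.

```python
from statistics import median

# Value counts copied from the Common Values table of this variable.
value_counts = {"Paper": 1544, "Phone": 1495, "Online": 1461}

n_rows = sum(value_counts.values())

# Total characters = sum over values of (string length x number of rows).
total_chars = sum(len(value) * rows for value, rows in value_counts.items())
mean_length = total_chars / n_rows

# Expand to one length per row so that median, min and max can be taken directly.
lengths = [len(value) for value, rows in value_counts.items() for _ in range(rows)]

print("rows:", n_rows)
print("total characters:", total_chars)
print("mean length:", round(mean_length, 7))
print("median length:", median(lengths))
print("min / max length:", min(lengths), max(lengths))
```

The total of 23961 characters over 4500 rows gives the reported mean length of 5.3246667.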

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
paper | 1544 | 34.3%
phone | 1495 | 33.2%
online | 1461 | 32.5%

Most occurring characters

Value | Count | Frequency (%)
e | 4500 | 18.8%
n | 4417 | 18.4%
P | 3039 | 12.7%
p | 1544 | 6.4%
a | 1544 | 6.4%
r | 1544 | 6.4%
h | 1495 | 6.2%
o | 1495 | 6.2%
O | 1461 | 6.1%
l | 1461 | 6.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 23961 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
e | 4500 | 18.8%
n | 4417 | 18.4%
P | 3039 | 12.7%
p | 1544 | 6.4%
a | 1544 | 6.4%
r | 1544 | 6.4%
h | 1495 | 6.2%
o | 1495 | 6.2%
O | 1461 | 6.1%
l | 1461 | 6.1%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 23961 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
e | 4500 | 18.8%
n | 4417 | 18.4%
P | 3039 | 12.7%
p | 1544 | 6.4%
a | 1544 | 6.4%
r | 1544 | 6.4%
h | 1495 | 6.2%
o | 1495 | 6.2%
O | 1461 | 6.1%
l | 1461 | 6.1%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 23961 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
e | 4500 | 18.8%
n | 4417 | 18.4%
P | 3039 | 12.7%
p | 1544 | 6.4%
a | 1544 | 6.4%
r | 1544 | 6.4%
h | 1495 | 6.2%
o | 1495 | 6.2%
O | 1461 | 6.1%
l | 1461 | 6.1%

cluster
Categorical

High correlation 

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 255.0 KiB
3: 1152
0: 1144
2: 1104
1: 1100

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 4500
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 2
3rd row: 3
4th row: 2
5th row: 1

Common Values

Value | Count | Frequency (%)
3 | 1152 | 25.6%
0 | 1144 | 25.4%
2 | 1104 | 24.5%
1 | 1100 | 24.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
3 | 1152 | 25.6%
0 | 1144 | 25.4%
2 | 1104 | 24.5%
1 | 1100 | 24.4%

Most occurring characters

Value | Count | Frequency (%)
3 | 1152 | 25.6%
0 | 1144 | 25.4%
2 | 1104 | 24.5%
1 | 1100 | 24.4%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 4500 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
3 | 1152 | 25.6%
0 | 1144 | 25.4%
2 | 1104 | 24.5%
1 | 1100 | 24.4%
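
The report lists the character category as (unknown) here. As a rough illustration of what a Unicode general-category breakdown looks like, the sketch below classifies the characters of this variable with Python's standard unicodedata module. This is an assumption about how one might recover the categories, not a description of how the report computes its tables; scripts and blocks are not covered because the standard library does not expose them.

```python
import unicodedata
from collections import Counter

# Value counts copied from the Common Values table of this variable.
value_counts = {"3": 1152, "0": 1144, "2": 1104, "1": 1100}

# Tally characters by Unicode general category, weighted by row count.
category_counts = Counter()
for value, rows in value_counts.items():
    for ch in value:
        category_counts[unicodedata.category(ch)] += rows

for category, count in category_counts.items():
    print(category, count)
```

All four values are ASCII digits, so every character falls into the single category Nd (decimal number), which agrees with the "Distinct categories: 1" figure.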

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 4500 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
3 | 1152 | 25.6%
0 | 1144 | 25.4%
2 | 1104 | 24.5%
1 | 1100 | 24.4%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 4500 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
3 | 1152 | 25.6%
0 | 1144 | 25.4%
2 | 1104 | 24.5%
1 | 1100 | 24.4%

claimlegitimacy
Categorical

Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 293.2 KiB
Legitimate: 4230
Fraud: 270

Length

Max length: 10
Median length: 10
Mean length: 9.7
Min length: 5

Characters and Unicode

Total characters: 43650
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Legitimate
2nd row: Legitimate
3rd row: Legitimate
4th row: Legitimate
5th row: Legitimate

Common Values

Value | Count | Frequency (%)
Legitimate | 4230 | 94.0%
Fraud | 270 | 6.0%
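
The Imbalance flag on this variable corresponds to the 67.3% figure in the report's Alerts section. One definition that reproduces that number from the two counts above is one minus the normalised Shannon entropy of the class distribution. The sketch below assumes that definition; the function name is hypothetical, and the single-class return value is a convention of this sketch only.

```python
import math

# Value counts copied from the Common Values table of this variable.
value_counts = {"Legitimate": 4230, "Fraud": 270}


def imbalance_score(counts):
    """Return 1 - H(p) / log(k) for k classes with the given counts.

    0.0 means perfectly balanced classes; values near 1.0 mean that one
    class dominates. With fewer than two classes this sketch returns 0.0.
    """
    counts = [c for c in counts if c > 0]
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log(c / total) for c in counts)
    return 1.0 - entropy / math.log(len(counts))


score = imbalance_score(value_counts.values())
print(f"imbalance: {score:.1%}")
```

With 94.0% of rows in one class, the entropy is about 0.327 bits out of a possible 1, which gives a score of about 67.3%.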

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
legitimate | 4230 | 94.0%
fraud | 270 | 6.0%

Most occurring characters

Value | Count | Frequency (%)
e | 8460 | 19.4%
t | 8460 | 19.4%
i | 8460 | 19.4%
a | 4500 | 10.3%
L | 4230 | 9.7%
g | 4230 | 9.7%
m | 4230 | 9.7%
F | 270 | 0.6%
r | 270 | 0.6%
u | 270 | 0.6%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 43650 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
e | 8460 | 19.4%
t | 8460 | 19.4%
i | 8460 | 19.4%
a | 4500 | 10.3%
L | 4230 | 9.7%
g | 4230 | 9.7%
m | 4230 | 9.7%
F | 270 | 0.6%
r | 270 | 0.6%
u | 270 | 0.6%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 43650 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
e | 8460 | 19.4%
t | 8460 | 19.4%
i | 8460 | 19.4%
a | 4500 | 10.3%
L | 4230 | 9.7%
g | 4230 | 9.7%
m | 4230 | 9.7%
F | 270 | 0.6%
r | 270 | 0.6%
u | 270 | 0.6%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 43650 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
e | 8460 | 19.4%
t | 8460 | 19.4%
i | 8460 | 19.4%
a | 4500 | 10.3%
L | 4230 | 9.7%
g | 4230 | 9.7%
m | 4230 | 9.7%
F | 270 | 0.6%
r | 270 | 0.6%
u | 270 | 0.6%

Interactions


Correlations

 | claimamount | claimlegitimacy | claimstatus | claimsubmissionmethod | claimtype | cluster | patientage | patientemploymentstatus | patientgender | patientincome | patientmaritalstatus | providerspecialty
claimamount | 1.000 | 0.406 | 0.000 | 0.006 | 0.044 | 0.022 | 0.009 | 0.007 | 0.022 | 0.019 | 0.014 | 0.017
claimlegitimacy | 0.406 | 1.000 | 0.000 | 0.000 | 0.000 | 0.437 | 0.030 | 0.013 | 0.009 | 0.400 | 0.000 | 0.000
claimstatus | 0.000 | 0.000 | 1.000 | 0.014 | 0.000 | 0.005 | 0.016 | 0.013 | 0.000 | 0.000 | 0.018 | 0.024
claimsubmissionmethod | 0.006 | 0.000 | 0.014 | 1.000 | 0.009 | 0.000 | 0.011 | 0.000 | 0.000 | 0.026 | 0.023 | 0.018
claimtype | 0.044 | 0.000 | 0.000 | 0.009 | 1.000 | 0.023 | 0.000 | 0.000 | 0.000 | 0.026 | 0.000 | 0.000
cluster | 0.022 | 0.437 | 0.005 | 0.000 | 0.023 | 1.000 | 0.029 | 0.000 | 0.004 | 0.920 | 0.000 | 0.000
patientage | 0.009 | 0.030 | 0.016 | 0.011 | 0.000 | 0.029 | 1.000 | 0.000 | 0.000 | 0.017 | 0.000 | 0.014
patientemploymentstatus | 0.007 | 0.013 | 0.013 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000
patientgender | 0.022 | 0.009 | 0.000 | 0.000 | 0.000 | 0.004 | 0.000 | 0.000 | 1.000 | 0.028 | 0.017 | 0.018
patientincome | 0.019 | 0.400 | 0.000 | 0.026 | 0.026 | 0.920 | 0.017 | 0.000 | 0.028 | 1.000 | 0.008 | 0.000
patientmaritalstatus | 0.014 | 0.000 | 0.018 | 0.023 | 0.000 | 0.000 | 0.000 | 0.000 | 0.017 | 0.008 | 1.000 | 0.000
providerspecialty | 0.017 | 0.000 | 0.024 | 0.018 | 0.000 | 0.000 | 0.014 | 0.000 | 0.018 | 0.000 | 0.000 | 1.000
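
To see which pairs in this matrix stand out, one can scan the upper triangle for coefficients above a cut-off. The sketch below copies the matrix values from the table; the 0.5 threshold and all names are assumptions made for illustration and are not stated in this section of the report.

```python
from itertools import combinations

names = [
    "claimamount", "claimlegitimacy", "claimstatus", "claimsubmissionmethod",
    "claimtype", "cluster", "patientage", "patientemploymentstatus",
    "patientgender", "patientincome", "patientmaritalstatus", "providerspecialty",
]

# Correlation matrix copied row by row from the Correlations table.
matrix = [
    [1.000, 0.406, 0.000, 0.006, 0.044, 0.022, 0.009, 0.007, 0.022, 0.019, 0.014, 0.017],
    [0.406, 1.000, 0.000, 0.000, 0.000, 0.437, 0.030, 0.013, 0.009, 0.400, 0.000, 0.000],
    [0.000, 0.000, 1.000, 0.014, 0.000, 0.005, 0.016, 0.013, 0.000, 0.000, 0.018, 0.024],
    [0.006, 0.000, 0.014, 1.000, 0.009, 0.000, 0.011, 0.000, 0.000, 0.026, 0.023, 0.018],
    [0.044, 0.000, 0.000, 0.009, 1.000, 0.023, 0.000, 0.000, 0.000, 0.026, 0.000, 0.000],
    [0.022, 0.437, 0.005, 0.000, 0.023, 1.000, 0.029, 0.000, 0.004, 0.920, 0.000, 0.000],
    [0.009, 0.030, 0.016, 0.011, 0.000, 0.029, 1.000, 0.000, 0.000, 0.017, 0.000, 0.014],
    [0.007, 0.013, 0.013, 0.000, 0.000, 0.000, 0.000, 1.000, 0.000, 0.000, 0.000, 0.000],
    [0.022, 0.009, 0.000, 0.000, 0.000, 0.004, 0.000, 0.000, 1.000, 0.028, 0.017, 0.018],
    [0.019, 0.400, 0.000, 0.026, 0.026, 0.920, 0.017, 0.000, 0.028, 1.000, 0.008, 0.000],
    [0.014, 0.000, 0.018, 0.023, 0.000, 0.000, 0.000, 0.000, 0.017, 0.008, 1.000, 0.000],
    [0.017, 0.000, 0.024, 0.018, 0.000, 0.000, 0.014, 0.000, 0.018, 0.000, 0.000, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off for calling a pair "highly correlated"

# Each unordered pair is visited once via the upper triangle of the matrix.
high_pairs = [
    (names[i], names[j], matrix[i][j])
    for i, j in combinations(range(len(names)), 2)
    if abs(matrix[i][j]) >= THRESHOLD
]

for left, right, value in high_pairs:
    print(f"{left} ~ {right}: {value:.3f}")
```

Under that cut-off only cluster and patientincome (0.920) qualify, which matches the High correlation flag on cluster. The next largest coefficients all involve claimlegitimacy (0.437 with cluster, 0.406 with claimamount, 0.400 with patientincome).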

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
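
As a rough sketch of the per-column count that a nullity bar chart summarises, the function below tallies missing cells for a list of row dictionaries. The function name, the choice to treat None and empty strings as missing, and the three-column excerpt of two rows from this report's Sample section are all assumptions made for illustration.

```python
def nullity_by_column(rows):
    """Return {column: number of missing cells}; None and "" count as missing."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            is_missing = value is None or value == ""
            counts[column] = counts.get(column, 0) + int(is_missing)
    return counts


# Two rows excerpted from the Sample section, restricted to three columns.
rows = [
    {"claimtype": "Inpatient", "cluster": "3", "claimlegitimacy": "Legitimate"},
    {"claimtype": "Emergency", "cluster": "1", "claimlegitimacy": "Fraud"},
]

print(nullity_by_column(rows))
```

Every count is zero for these rows, in line with the report's overall figure of 0 missing cells.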

Sample

First rows

index | claimid | patientid | providerid | claimamount | claimdate | diagnosiscode | procedurecode | patientage | patientgender | providerspecialty | claimstatus | patientincome | patientmaritalstatus | patientemploymentstatus | providerlocation | claimtype | claimsubmissionmethod | cluster | claimlegitimacy
0 | 4d76c7f7-d36a-4139-b451-a9a4ad10d7d5 | 19cf2638-3ec0-4ed9-9995-d9ba4553813a | a3d0cc80-dffe-40ff-a302-23c8ffeedb36 | 7820.52 | 2024-07-08 00:00:00 | Ta150 | iO013 | 96 | F | Orthopedics | Pending | 57595.11 | Single | Employed | New Alishaview | Inpatient | Paper | 3 | Legitimate
1 | e35193b4-3609-492b-866a-98de19317e9c | 5c4bb6c5-4dd3-4a86-85fa-f36c0d8debff | a9f25acf-92b8-45e2-9cef-87bd07d0a591 | 5453.86 | 2024-07-08 00:00:00 | Fo766 | jR349 | 95 | M | Cardiology | Denied | 140772.72 | Widowed | Employed | East Curtis | Inpatient | Online | 2 | Legitimate
2 | 1f3fa373-25ed-4ff4-b6c7-38dcb2fb297f | 777866e0-4d10-45a8-a7b4-dbdaa26d5a81 | 951b1e08-9948-4956-80e5-9277f16bd290 | 8229.86 | 2024-07-08 00:00:00 | AX876 | uU479 | 10 | M | Cardiology | Pending | 69803.19 | Married | Student | Lake Jennifer | Emergency | Online | 3 | Legitimate
3 | af6a68f4-8319-47b1-a28b-77de01572851 | 9d7c53ee-eb1a-4f07-9e3a-e86cf82e9f0f | de9e193a-f9a1-4d63-9345-aefe75694628 | 9519.16 | 2024-07-08 00:00:00 | SQ441 | Xs264 | 59 | F | Cardiology | Pending | 135530.12 | Married | Employed | Martinstad | Routine | Phone | 2 | Legitimate
4 | 417fe944-79d2-4610-81c4-a2d496f29ee4 | db14b0ca-ac2a-4e83-b085-947ea32e7587 | 5c7d7045-71b6-4c15-937c-43e4cfe65bf4 | 3226.15 | 2024-07-08 00:00:00 | FK970 | PV476 | 36 | F | Neurology | Approved | 36995.52 | Divorced | Unemployed | Thomasfurt | Inpatient | Phone | 1 | Legitimate
5 | 41c69c3f-7b63-435c-841f-97633264a347 | 9caba0e6-334d-4132-9330-1c1adaa82d11 | 11ff25ad-29c9-493b-a356-cb0c6a8f41a6 | 3476.56 | 2024-07-07 00:00:00 | ZE958 | Am159 | 26 | F | Cardiology | Denied | 96819.09 | Divorced | Retired | North Michael | Outpatient | Paper | 0 | Legitimate
6 | 80a92d69-9d51-476c-8d1d-0ea35a7081a9 | c4daf0c4-8d67-4aba-97db-442a948db4d3 | 19d62078-bb03-4473-8815-5f814c12b5c8 | 6468.55 | 2024-07-07 00:00:00 | hg131 | vm240 | 3 | M | Neurology | Denied | 117271.04 | Married | Employed | West Paul | Emergency | Paper | 2 | Legitimate
7 | 31c804c9-110c-4c26-bf38-2638b9e29526 | 71d7c4ac-c608-4392-8f71-83ce85d00595 | c1bfab96-0df6-4a49-96e1-236bd3c6a7b5 | 280.40 | 2024-07-07 00:00:00 | Xa559 | eD733 | 99 | M | Pediatrics | Denied | 125318.21 | Widowed | Employed | Ambermouth | Inpatient | Paper | 2 | Legitimate
8 | 25d801f8-d141-4131-9f1f-0c63360b4302 | 919f254e-a7eb-41da-8f14-b2ee11aad6da | 8d1a5376-5ea6-42e6-beec-2c1313a30a49 | 4661.71 | 2024-07-07 00:00:00 | Sj663 | uq058 | 57 | F | Neurology | Pending | 24263.98 | Widowed | Employed | Larsonville | Inpatient | Paper | 1 | Legitimate
9 | 8b5172a0-9aab-439a-9e3f-d0af5f2a1b6b | 6d707925-803e-42b8-af6e-e77d9f45dd8a | ed55392c-f9e3-469c-9367-00df827b1cf6 | 9638.64 | 2024-07-07 00:00:00 | Qu671 | Gw549 | 91 | M | Pediatrics | Approved | 78191.10 | Widowed | Unemployed | South Jessicabury | Outpatient | Phone | 3 | Legitimate

Last rows

index | claimid | patientid | providerid | claimamount | claimdate | diagnosiscode | procedurecode | patientage | patientgender | providerspecialty | claimstatus | patientincome | patientmaritalstatus | patientemploymentstatus | providerlocation | claimtype | claimsubmissionmethod | cluster | claimlegitimacy
4490 | a996469c-9d91-437d-9a13-33384ec86e27 | 6c35f381-e4fc-4d5b-a4e4-fc8abeed154a | 04a1cc5f-73cc-4bca-b6f0-af3fc2529ec4 | 5879.35 | 2022-07-10 00:00:00 | FC909 | jB020 | 84 | F | Cardiology | Denied | 129931.42 | Divorced | Unemployed | Jacobsberg | Outpatient | Paper | 2 | Legitimate
4491 | 6c85c4f1-bc4b-46fb-a573-3fea7853cb38 | 05ec1ef6-bef2-43af-befa-c00eb68af7a9 | 67cd32ac-9518-417a-aa8e-bc46911951e4 | 6250.80 | 2022-07-10 00:00:00 | aW066 | su896 | 91 | M | General Practice | Pending | 43191.77 | Married | Employed | Lake Edwardmouth | Routine | Phone | 1 | Legitimate
4492 | 4e2838d1-2819-441f-a435-97c22b9f4e8b | d708e793-c837-40e4-8d4c-852a94b4e87f | 3fc3e1bd-a685-4a9e-8014-1d81a6eaffe5 | 8290.29 | 2022-07-10 00:00:00 | tl190 | mO870 | 22 | M | Orthopedics | Denied | 86328.07 | Widowed | Retired | Garcialand | Inpatient | Phone | 0 | Legitimate
4493 | 8a2ffe23-6145-49b3-89a4-1013c4e858e2 | 516ea776-3d69-4c5f-a9d2-bae1698b28d2 | 2fcfedff-0300-4c6d-82ba-cb4858c7487f | 9102.27 | 2022-07-09 00:00:00 | ZQ868 | GC202 | 0 | M | Orthopedics | Approved | 48185.86 | Married | Student | South Anthonyfurt | Emergency | Phone | 1 | Fraud
4494 | 4c4e4abc-e65d-485e-9882-c44485e63917 | f3697794-b8d7-4c0d-a18c-72e5cab95d95 | 98f91962-bcf3-482b-8ea7-f003d74c86ae | 1189.51 | 2022-07-09 00:00:00 | Ux531 | bJ956 | 68 | F | Pediatrics | Denied | 108225.81 | Married | Employed | Herreraborough | Routine | Paper | 0 | Legitimate
4495 | 6c427360-20ae-43b8-802f-bd25fae3ce09 | c0ddd919-1b16-4689-9963-7566ba410835 | c0039b67-ace3-4f97-a646-4214419f9fdf | 3041.50 | 2022-07-09 00:00:00 | qJ110 | bn806 | 10 | M | General Practice | Denied | 80395.76 | Widowed | Student | New Melissastad | Emergency | Paper | 3 | Legitimate
4496 | 43b72c25-94ae-4f1f-a2fb-cb3297978674 | 02ea4377-cf98-4251-a1d6-8eb720d903d8 | 2dcbfa56-e73a-42b5-bfbf-02bfb9b3f990 | 5153.28 | 2022-07-09 00:00:00 | dc670 | wX329 | 96 | F | Neurology | Pending | 31560.84 | Widowed | Retired | Lake Cathymouth | Outpatient | Phone | 1 | Legitimate
4497 | e0bf8e55-7440-48bb-9583-187ab12a5682 | 14844cfb-2bff-4be5-8540-7d58c72ed309 | ae3fdf78-c574-495a-ba8a-2246ba1d61a5 | 6908.45 | 2022-07-09 00:00:00 | cF152 | aT402 | 97 | F | Pediatrics | Denied | 74973.94 | Married | Unemployed | Garyborough | Inpatient | Online | 3 | Legitimate
4498 | 1a3f947a-f3a7-4286-8925-aed2eced6ee2 | cfedbf0b-43eb-4dbe-a26b-74bd566898c8 | d344683d-f2e2-4262-8c04-f9e92fda1d33 | 5830.19 | 2022-07-09 00:00:00 | Sc398 | wv342 | 14 | F | General Practice | Approved | 147665.80 | Widowed | Student | East Claudiafurt | Routine | Paper | 2 | Legitimate
4499 | 291cfa64-9956-40e7-b89f-4628650f42f0 | 2bd2d173-4ce1-428d-836c-259d9236a839 | cf84cf99-0ac3-465a-af90-239a873bafa5 | 5848.92 | 2022-07-09 00:00:00 | TQ972 | Sn273 | 47 | M | Pediatrics | Approved | 131676.02 | Married | Unemployed | North Amberborough | Inpatient | Phone | 2 | Legitimate